46 research outputs found

    Analysing and predicting micro-location patterns of software firms

    While the effects of non-geographic aggregation on inference are well studied in economics, research on geographic aggregation is rather scarce. This knowledge gap, together with the use of aggregated spatial units in previous firm location studies, results in a lack of understanding of firm location determinants at the microgeographic level. Suitable data for microgeographic location analysis has become available only recently through the emergence of Volunteered Geographic Information (VGI), especially the OpenStreetMap (OSM) project, and the increasing availability of official (open) geodata. In this paper, we use a comprehensive dataset of three million street-level geocoded firm observations to explore the location pattern of software firms in an Exploratory Spatial Data Analysis (ESDA). Based on the ESDA results, we develop a software firm location prediction model using Poisson regression and OSM data. Our findings demonstrate that the model yields plausible predictions and that OSM data is suitable for microgeographic location analysis. Our results also show that non-aggregated data can be used to detect information on location determinants, which are superimposed when aggregated spatial units are analysed, and that some findings of previous firm location studies are not robust at the microgeographic level. However, we also conclude that the lack of high-resolution geodata on socio-economic population characteristics causes systematic prediction errors, especially in cities with diverse and segregated populations.

    ISPRS International Journal of Geo-Information / Analyzing and predicting micro-location patterns of software firms

    While the effects of non-geographic aggregation on statistical inference are well studied in economics, research on the effects of geographic aggregation on regression analysis is rather scarce. This knowledge gap, together with the use of aggregated spatial units in previous firm location studies, results in a lack of understanding of firm location determinants at the microgeographic level. Suitable data for microgeographic location analysis has become available only recently through the emergence of Volunteered Geographic Information (VGI), especially the OpenStreetMap (OSM) project, and the increasing availability of official (open) geodata. In this paper, we use a comprehensive dataset of three million street-level geocoded firm observations to explore the location pattern of software firms in an Exploratory Spatial Data Analysis (ESDA). Based on the ESDA results, we develop a software firm location prediction model using Poisson regression and OSM data. Our findings offer novel insights into the mode of operation of the Modifiable Areal Unit Problem (MAUP) in the context of a microgeographic location analysis: We find that non-aggregated data can be used to detect information on location determinants, which are superimposed when aggregated spatial units are analyzed, and that some findings of previous firm location studies are not robust at the microgeographic level. However, we also conclude that the lack of high-resolution geodata on socio-economic population characteristics causes systematic prediction errors, especially in cities with diverse and segregated populations. (VLID)238648

    Web mining of firm websites: a framework for web scraping and a pilot study for Germany

    Nowadays, almost all (relevant) firms have their own websites which they use to publish information about their products and services. Using the example of innovation in firms, we outline a framework for extracting information from firm websites using web scraping and data mining. For this purpose, we present an easy and free-to-use web scraping tool for large-scale data retrieval from firm websites. We apply this tool in a large-scale pilot study to provide information on the data source (i.e. the population of firm websites in Germany), which has as yet not been studied rigorously in terms of its qualitative and quantitative properties. We find, inter alia, that the use of websites and websites’ characteristics (number of subpages and hyperlinks, text volume, language used) differs according to firm size, age, location, and sector. Web-based studies also have to contend with distinct outliers and the fact that low broadband availability appears to prevent firms from operating a website. Finally, we propose two approaches based on neural network language models and social network analysis to derive firm-level information from the extracted web data.
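
    The website characteristics named in this abstract (number of hyperlinks, text volume) can be illustrated with a minimal sketch. It is not the scraping tool the authors present: it uses only Python's standard `html.parser`, and the names `PageStats` and `page_features` are hypothetical, chosen for this example.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collect hyperlink targets and visible text from one HTML page.

    Illustrative only; not the tool described in the abstract.
    """

    def __init__(self):
        super().__init__()
        self.links = []        # href values of <a> tags
        self.text_parts = []   # visible text fragments
        self._skip = 0         # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore script/style bodies and whitespace-only fragments.
        if not self._skip and data.strip():
            self.text_parts.append(data.strip())


def page_features(html):
    """Return simple per-page features: hyperlink count and text volume."""
    parser = PageStats()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts)
    return {
        "n_hyperlinks": len(parser.links),
        "text_chars": len(text),
        "n_words": len(text.split()),
    }
```

    Applied to every subpage of a firm's website, such features could then be compared across firm size, age, location, and sector, which is the kind of comparison the abstract reports.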

    Predicting innovative firms using web mining and deep learning

    Innovation is considered a main driver of economic growth. Promoting the development of innovation through STI (science, technology and innovation) policies requires accurate indicators of innovation. Traditional indicators often lack coverage, granularity as well as timeliness and involve high data collection costs, especially when conducted at a large scale. In this paper, we propose a novel approach to creating firm-level innovation indicators at the scale of millions of firms. We use traditional firm-level innovation indicators from the questionnaire-based Community Innovation Survey (CIS) to train an artificial neural network classification model on labelled (innovative/non-innovative) web texts of surveyed firms. Subsequently, we apply this classification model to the web texts of hundreds of thousands of firms in Germany to predict their innovation status. Our results show that this approach produces credible predictions and has the potential to be a valuable and highly cost-efficient addition to the existing set of innovation indicators, especially due to its coverage and regional granularity. The predicted firm-level probabilities can also directly be interpreted as a continuous measure of innovativeness, opening up additional advantages over traditional binary innovation indicators.

    The digital layer: alternative data for regional and innovation studies

    The lack of large-scale data revealing the interactions among firms has constrained empirical studies. Utilizing relational web data has remained unexplored as a remedy for this data problem. We constructed a Digital Layer by scraping the inter-firm hyperlinks of 600,000 German firms and linked the Digital Layer with several traditional indicators. We showcase the use of this developed dataset by testing whether the Digital Layer data can replicate several theoretically motivated and empirically supported stylized facts. The results show that the intensity and quality of firms’ hyperlinks are strongly associated with the innovation capabilities of firms and, to a lesser extent, with hyperlink relations to geographically distant and cognitively close firms. Finally, we discuss the implications of the Digital Layer approach for an evidence-based assessment of sectoral and place-based innovation policies.

    Leveraging the digital layer: the strength of weak and strong ties in bridging geographic and cognitive distances

    Firms may seek non-redundant information through inter-firm relations beyond their geographic and cognitive boundaries (i.e., relations with firms in other regions and active in different fields). Little is known about the conditions under which firms benefit from this high-risk/high-gain strategy. We created a digital layer of 600,000 German firms by using their websites' textual and relational content. Our results suggest that strong relations (relations with common third partners) between firms from different fields and inter-regional relations are positively associated with a firm's innovation level. We also found that a specific combination of weak and strong relations confers greater innovation benefits.
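
    The abstract characterises strong relations as relations with common third partners. A minimal sketch of that idea follows. It assumes ties are treated as undirected and that a single shared partner is enough to count as strong; the function name `classify_ties` is hypothetical, and the paper's actual operationalisation may differ.

```python
from collections import defaultdict


def classify_ties(edges):
    """Label each undirected tie 'strong' if its two firms share at least
    one common partner, otherwise 'weak'.

    Assumption for illustration: one common third partner suffices.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:                      # ignore self-links
            neighbours[a].add(b)
            neighbours[b].add(a)

    labels = {}
    for a, b in edges:
        if a == b:
            continue
        # Partners linked to both ends of the tie, excluding the ends.
        common = (neighbours[a] & neighbours[b]) - {a, b}
        labels[frozenset((a, b))] = "strong" if common else "weak"
    return labels
```

    For example, in the edge list `[("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]` the three ties among A, B and C close a triangle and are labelled strong, while the tie between C and D has no common partner and is labelled weak.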

    Microgeography of innovation in the city: location patterns of innovative firms in Berlin

    This paper investigates the micro-location pattern of innovative and non-innovative firms in Berlin using detailed information on the firms’ addresses and their local environment. The study employs a unique, representative panel data set of Berlin-based firms from manufacturing and services covering a five-year period (2011-2015) and applying the standard concepts and measurement approaches used in the Community Innovation Surveys. While controlling for firm size, age and sector, we find product innovators and R&D-performing firms located closer to research infrastructures, start-ups and other firms from the same industry. They tend to prefer more dynamic neighbourhoods and avoid very densely populated areas. For process innovators, no significant differences from non-process innovators are found. Firms are more likely to introduce new-to-market innovations if other firms in their direct neighbourhood had introduced such innovations in the previous period, but also if firms with such innovations have moved out of their neighbourhood. The ‘creative environment’ of a firm in terms of bars, cafes, clubs, leisure facilities or cultural locations does not seem to be linked to the innovative activity of firms.

    The strength of weak and strong ties in bridging geographic and cognitive distances

    The proximity framework has attracted considerable attention in the scholarly discourse on the driving forces of knowledge exchange tie formation. It has been discussed that too much proximity is negatively associated with the effectiveness of a knowledge exchange relation. However, little is known about the key factors that trigger the formation of boundary-spanning knowledge ties. Going beyond the “dyadic” perspective on proximity dimensions, this paper argues that the key factor in bridging distances may reside at the “triadic” level. We build on the notion of “the strength of weak ties” and its recent development by investigating the innovative performance and relations of more than 600,000 German firms. We explored and extracted information from the textual and relational content of firms’ websites by using machine learning techniques and hyperlink analysis. We thereby proxied the innovative performance of firms using a deep learning text analysis approach and showed that the triadic property of bridging dyadic relations is a reliable predictor of firms’ innovativeness. Relations embedded in cliques (i.e., strong ties) that connect cognitively distant firms are more strongly associated with firms’ innovation, whereas inter-regional relations connecting different parts of a network (i.e., weak ties) are positively associated with firms’ innovative performance. Also, the results suggest that a combination of strong inter-community and weak inter-regional relations is more positively related to firms’ innovativeness than combinations of other relation types.

    Epidemic effects in the diffusion of emerging digital technologies: evidence from artificial intelligence adoption

    The properties of emerging, digital, general-purpose technologies make it hard to observe their adoption by firms and identify the salient determinants of adoption. However, these aspects are critical since the patterns related to early-stage diffusion establish path-dependencies which have implications for the distribution of the technological opportunities and socio-economic returns linked to these technologies. We focus on the case of artificial intelligence (AI) and train a transformer language model to identify firm-level AI adoption using textual data from over 1.1 million websites and constructing a hyperlink network that includes more than 380,000 firms in Germany, Austria, and Switzerland. We use these data to expand and test epidemic models of inter-firm technology diffusion by integrating the concepts of social capital and network embeddedness. We find that AI adoption is related to three epidemic effect mechanisms: 1) Indirect co-location in industrial and regional hot-spots associated with production of AI knowledge; 2) Direct exposure to sources transmitting deep AI knowledge; 3) Relational embeddedness in the AI knowledge network. The pattern of adoption identified is highly clustered and features a rather closed system of AI adopters which is likely to hinder its broader diffusion. This has implications for policy which should facilitate diffusion beyond localized clusters of expertise. Our findings also point to the need to employ a systemic perspective to investigate the relation between AI adoption and firm performance to identify whether appropriation of the benefits of AI depends on network position and social capital.